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Abstract 

In this paper we introduce a new multilevel Monte Carlo (MLMC) estimator for multidi- 
mensional SDEs driven by Brownian motion. Giles has previously shown that if we combine a 
numerical approximation with strong order of convergence 0(At) with MLMC we can reduce 
the computational complexity to estimate expected values of functionals of SDE solutions with 
a root-mean-square error of e from 0{e~^) to 0{e^^). However, in general, to obtain a rate of 
strong convergence higher than 0(At^/^) requires simulation, or approximation, of Levy areas. 
In this paper, through the construction of a suitable antithetic multilevel correction estimator, 
we are able to avoid the simulation of Levy areas and still achieve an 0{At^) variance for smooth 
payoffs, and almost an 0{At^^^) variance for piecewise smooth payoffs, even though there is 
only 0{At^/'^) strong convergence. This results in an 0{e^^) complexity for estimating the value 
of European and Asian put and call options. 
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1 Introduction 



In many financial engineering applications, one is interested in the expected value of a financial 
derivative whose payoff depends upon the solution of a stochastic differential equation (SDE). 
Using a simple Monte Carlo method with a numerical discretisation with first order weak con- 
vergence, to achieve a root-mean-square error of e would require 0(e~^) independent paths, 
each with 0(e~^) time steps, giving a computational complexity which is 0(e"'^), [3]. 

Recently, Giles [U] introduced a multilevel Monte Carlo (MLMC) estimator which enables a 
reduction of this computational cost to 0(e^^(log e)^) for Lipschitz payoffs when using the Euler- 
Maruyama discretisation. Subsequent research using the Milstein discretisation [3] improved this 
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to 0(e~^) for a variety of non-smooth and path-dependent options based on the solution of a 
scalar SDE. However, a weakness of the Milstein discretisation is that in multiple dimensions it 
generally requires the simulation of iterated Ito integrals known as Levy areas, for which there 
is no known efficient method except in dimension 2 [H \W[ [IB] . 

We consider a general class of multi-dimensional SDEs driven by Brownian motion. Let 
(Q, J^, {Tt}t>o,^) be a complete probability space with a filtration {J't}t>o satisfying the usual 
conditions, and let w{t) be a D-dimensional Brownian motion defined on the probability space. 
We consider the numerical approximation of SDEs of the form 

dx{t) = f{x{t)) dt + g{x{t)) dw{t), (1.1) 

where x{t) G M<^ for each t>0, f £ C^iW^,^^), g G C2(M'^, R"'^^), and for simplicity we assume 
a fixed initial value xq G R"*. 

In this paper we are primarily concerned with estimating E[i-*(x(T))], the expected value of a 
payoff depending on the solution at a fixed time T, Defining the tensor hijk{x) as 




when using uniform timesteps At = T/N, the i component of the first order Milstein 
approximation ~ x(n At) has the form |12] 

D D 

= Xi^ri + fi{Xn) At + ^^gij{Xn) ^Wj,n + ^ /lijfc(X„) {Awj^n^Wk,n " %fc At - Ajk,n) 
j=l j,k=l 

(1.3) 

where is the correlation matrix for the driving Brownian paths, and ^jfc,n is the Levy area 
defined as 

Ajk,n= I {Wj{t)-Wj{tn)) dWk{t) - {wkit)-Wkit„))dWj{t). 

In some applications, the diffusion coefficient g{x) has a commutativity property which gives 
hijk{x) = hikj{x) for all i,j,k. In that case, because the Levy areas are anti-symmetric (i.e. 
Ajk,n = ~Akj,n)j it follows that hijk{Xn) Ajk^^ + hikj{Xn) Ai^j^^ = and therefore the terms 
involving the Levy areas cancel and so it is not necessary to simulate them. However, this only 
happens in special cases. 

Clark & Cameron proved for a particular SDE that it is impossible to achieve a better order 
of strong convergence than the Euler-Maruyama discretisation when using just the discrete 
increments of the underlying Brownian motion. The analysis was extended by Miiller-Gronbach 
|14j to general SDEs. As a consequence if we use the standard MLMC method with the Milstein 
scheme without simulating the Levy areas the complexity will remain the same as for Euler- 
Maruyama. Nevertheless, in this paper we show that by constructing a suitable antithetic 
estimator one can neglect the Levy areas and still obtain a multilevel correction estimator with 
a variance which decays at the same rate as the scalar Milstein estimator. 

We begin the paper by reviewing the multilevel Monte Carlo approach, introducing the idea of 
the antithetic estimator and bounding the behaviour of its variance under certain conditions. 
Because of its simplicity, we then consider Clark & Cameron's model problem, and prove that 
the antithetic path simulations do satisfy the required conditions to give an O(At^) variance 
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convergence for a smooth payoff. This then motivates the subsequent analysis for the general 
class of multidimensional SDEs. The appendix contains the detailed proofs of the key theorems. 

In this paper we restrict attention to financial applications with either a European payoff, 
dependent on the final value x{T), or an Asian payoff, dependent on the average of x{t) over 
the time interval [0,r]. It is proved that when the payoff is twice differentiable, with bounded 
derivatives, the rate of convergence of the multilevel correction variance is doubled from 0{At) 
to O(At^). If the payoff is Lipschitz, and twice differentiable almost everywhere, then the rate 
of convergence is reduced to 0(Ai^/^), but this is still sufficient to make the overall complexity 
0(e~^) to achieve a root-mean-square accuracy of e. 



2 Multilevel Monte Carlo estimation 

2.1 MLMC estimators 

In its most general form, multilevel Monte Carlo simulation uses a number of levels of resolution, 
£ = 0, 1, . . . , -L, with i = being the coarsest, and i = L being the finest. In the context of a 
SDEs simulation, level may have just one timestep for the whole time interval [0, T], whereas 
level L might have 2^ uniform timesteps. 

If P denotes the payoff (or other output functional of interest), and Pi denote its approximation 
on level I, then the expected value IE[Pi] on the finest level is equal to the expected value E[Po] 
on the coarsest level plus a sum of corrections which give the difference in expectation between 
simulations on successive levels, 

L 

E[PL]=nPo] + Y.nPi-Pe-i]- (2.1) 

e=i 

The idea behind MLMC is to independently estimate each of the expectations on the right-hand 
side of ()2.ip in a way which minimises the overall variance for a given computational cost. Let 
Yq be an estimator for E[Po] using Nq samples, and let Yi, £ > 0, be an estimator for E[P£ — P£_i] 
using A'^ samples. The simplest estimator is a mean of independent samples, which for £ > 
is 

yi = Ni'j^{pi-pu). (2.2) 

i=l 

The key point here is that Pj!—P^_-^ should come from two discrete approximations for the same 
underlying stochastic sample, so that on finer levels of resolution the difference is small (due to 
strong convergence) and so the variance is also small. Hence very few samples will be required 
on finer levels to accurately estimate the expected value. 

Here we recall the Theorem from [8j (which is a slight generalisation of the original theorem in 
|E]) which gives the complexity of MLMC estimation. 

Theorem 2.1. Let P denote a functional of the solution of a stochastic differential equation, 
and let Pi denote the corresponding level i numerical approximation. If there exist independent 
estimators Yi based on Ni Monte Carlo samples, and positive constants a, /?, 7, ci, C2, C3 such 
that min(/3,7) and 
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i) |E[P,-P]| <ci2-"^ 

E[P^-P^_i], i>0 
Hi) Y[Ye] < C2N-^2~^^ 

iv) Ci < c^Ni2"'^, where Ce is the computational complexity ofYi 



then there exists a positive constant C4 such that for any e < e ^ there are values L and for 
which the multilevel estimator 



£=0 



< e 



has a mean- square- error with bound 

MSB = E 

with a computational complexity C with bound 

C4e-2, /3>7, 
C4e-2(loge)2, /3 = 7, 



C < i 



In (|2.2p we have used the same estimator for the payoff Pi on every level i, and therefore (|2.ip 
is a trivial identity due to the telescoping summation. However, in [5] Giles numerically showed 
that it can be better to use different estimators for the finer and coarser of the two levels being 
considered, P/ when level £ is the finer level, and Pf when level i is the coarser level. In this 



case, we require that 



so that 



E[P/]=E[P/] for ^ = 1,...,L, 



E[P[] = E[pf] +J2nPl - PLi] 



(2.3) 



The MLMC Theorem is still applicable to this modified estimator. The advantage is that it 
gives the flexibility to construct approximations for which P^ — Pi_i is much smaller than the 
original Pg — Pf_i, giving a larger value for (3, the rate of variance convergence in condition Hi) 
in the theorem. 



2.2 Antithetic MLMC estimator 



Based on the well-known method of antithetic variates (see for example |10j). the idea for the 
antithetic estimator is to exploit the flexibility of the more general MLMC estimator by defining 
P£_i to be the usual payoff P{X^) coming from a level i — 1 coarse simulation X'^, and define 
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Pi to be the average of the payoffs coming from an antithetic pair of level £ 

simulations, X-^ and X". 

will be defined in a way which corresponds naturally to the construction of X"^. Its antithetic 
"twin" X" will be defined so that it has exactly the same distribution as X-^, conditional on 
X'^, which ensures that E[P{X^)] = E[P{X'')] and hence ([23]) is satisfied, but at the same time 



X' 



and therefore 



P{Xf) - P{X^] 



(P(X") - P(X^)) 



so that i (P(X/) + P(X")) « P(X'=). This leads to i (P(X/) + P(X'^)) - P(X^) having a 
much smaller variance than the standard estimator P{Xf) — P{X'^). 

We now present a lemma which motivates the rest of the paper by giving an upper bound on 
the convergence of the variance of ^ (P(X-^) + P(X")) — P[X^). 



Lemma 2.2. //P G C'^{I 



then for p > 2, 



and there exist constants Li, L2 such that for all x G 



dP 



dx 



d^P 



dx^ 



E 



i(P(X-^) + P(X'^)) -P(X' 

< 2P~^ L\ E [ i(X-^+X") -X^ + 2-(P+i) Ll E 



X-^ -X" 



2p 



Proof. If we define X^ = i(X-^+X"), then a Taylor expansion gives 



P(X^) = P(X^) + ^ (X^) (X^- x^) + i(x/- x^)^ 1^(6) (X/- X^) 

for some ^1 on the line between X and X-*. Performing a similar expansion for P(X") and 
then averaging the two, the linear terms cancel and one obtains 



i(P(X^) + P(X'^)) = P(X^) + i(X^- 


X^] 


dx^ ^^'^ 


(X/- 


X^] 


+ 


-xf 




(X"- 


-xf 


= P(X^) + |(X/- 


X") 




(X/- 


X") 



for some ^3 on the line between X"" and X-^, due to the mean value theorem. We then obtain 



i(P(X-^) + P(X"))-P(X^ 



for some ^4 on the line between X^ and X*^. Hence 



i(P(X^) + P(X"))-P(X^ 



< ^1 



x-^-x'^ 



+ \L2 



X^-X" 
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and the final result follows from the standard inequality 



N 
n=l 



N 



< 



n=l 



and then taking the expectation. 



(2.4) 
□ 



In the multidimensional SDE applications considered in this paper, we will show that the Mil- 
stein approximation with the Levy areas set to zero, combined with the antithetic construction, 
leads to - X'' = 0{At^/^) but X^ - X^ = 0{At). Hence, the variance V[i(P/+P;'^) - 
is 0{At'^), which is the order obtained for scalar SDEs using the Milstein discretisation with 
its first order strong convergence. We first show this for the simple Clark &; Cameron model 
problem which can be analysed in detail. We then extend the analysis to a general class of 
multidimensional SDEs. 



3 Clark- Cameron Example 



3.1 Clark-Cameron analysis 

The paper of Clark and Cameron [2] addresses the question of how accurately one can approxi- 
mate the solution of an SDE driven by an underlying multi-dimensional Brownian motion, using 
only uniformly-spaced discrete Brownian increments. Their model problem is 

dxi{t) = dwi{t) 

dx2{t) = Xi{t)dw2{t), (3.1) 

with x(0) = y(0) = 0, and zero correlation between the two Brownian motions wi{t) and 'W2{t). 
These equations can be integrated exactly over a time interval where t„ = nAt, to 

give 

Xlitn+l) = Xi{tn) + Awi^n 

X2itn+l) = X2itn) + Xi{tn)Aw2,n + ^^Wi,n^W2,n+ ^Ai2,n (3.2) 

where Awi^n = Wiitn+i) — Wi{tn)-, and Ai2,n is the Levy area defined as 

rtn+i rtn+1 
Al2,n= {wi{t)-Wi{tn)) dW2{t) - {w2{t) -W2{tn)) dwi{t). 

This corresponds exactly to the Milstein discretisation presented in ()1.3p . so for this simple 
model problem the Milstein discretisation is exact. 

The point of Clark and Cameron's paper is that for a given set of discrete Brownian incre- 
ments, the value for xi{tn) is determined exactly for all n, but the value for X2{tn) depends on 
the unknown Levy areas. Since E[Ai2,n | Awi^m Aw2,n] = 0) the conditional expected value is 
given by ()3.2p with the Levy areas set to zero. In addition, it follows that for any numerical 
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approximation X{T) based solely on the set of discrete Brownian increments Aw, 



E[(x2(r) - X2{T)f] = n E[(x2(r) - X2{T)f I Aw] ] 
> E[ V[x2(r) I Aw] ] 

N-l 
n=0 

= iTAt. 

Hence, one cannot achieve better than 0(At^/^) strong convergence, and the mean square error 
is minimised when the inequality in the above equation is an equality, which is when 

X2{T)=E[x2{T)\Aw], 

which is achieved by setting the Levy areas set to zero. 



3.2 Antithetic MLMC estimator 

We define a coarse path approximation X'^ with timestep At by neglecting the Levy area terms 
to give 

Xi^n+l = ^l,n + ^^l,n 

Xln+1 = Xl^ + Xl^Aw2,n + \Awi^nAw2,n (3.3) 

This is equivalent to replacing the true Brownian path by a piecewise linear approximation as 
illustrated in Figure [TJ 

Similarly, we define the corresponding two half-timesteps of the first fine path approximation 



W 




Figure 1: Brownian path and approximations over one coarse timestep 
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Xf by 



1 

2 ' 2 



in which = — w{tn), = w{tn+i) — w{t^_^i) are the Brownian increments 

over the first and second halves of the coarse timestep, and so Au)„ = 6wn + Sw^.i. Using this 
relation, the equations for the two fine timesteps can be combined to give an equation for the 
increment over the coarse timestep, 

Kn+l = Xln + Xl^/\w2,n + \^Wi^n^W2,n (3.4) 



The antithetic approximation X!^ is defined by exactly the same discretisation except that the 
Brownian increments 5wn and 5w^_^i are swapped, as illustrated in Figured! This gives 

Xln+l = ^2,n+i + ^l,n+i '^^2,n + ^ Swi^n Sw2,n, 

and hence 

X^n+l = Xl^ + Xl^Aw2,n + ^Awi,nAw2,n (3.5) 

Swapping 5wn and 5w^^i does not change the distribution of the driving Brownian increments, 

and hence X"" has exactly the same distribution as X^ . Note also the change in sign in the last 
term in ()3.4p compared to the corresponding term in (|3.5p . This is important because these two 
terms cancel when the two equations are averaged. 

These last terms correspond to the Levy areas for the fine path and the antithetic path, and 
the sign reversal is a particular instance of a more general result for time-reversed Brownian 
motion, [11]. If {wt,0 < t < 1) denotes a Brownian motion on the time interval [0, 1] then the 
time-reversed Brownian motion {zt,0 <t < 1) defined by 

zt = wi- wi-t, (3.6) 

has exactly the same distribution, and it can be shown that its Levy area is equal in magnitude 
and opposite in sign to that of Wt- 
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Lemma 3.1. If xl, X"^ and X^ are as defined above, then 

x{ „ = x?„ = X? „, i (xl + „ I = Xo„, yn < N 



and 



E 



^2,N ~ ^2,N 



■T{T+At) At^. 



Proof. Comparing p.3p . (j3.4p and ()3.5p . it is clear that x(^, ^in ^^'^ -^in satisfy the 
same difference equation and so are equal. Given this, averaging the equations for X2 „ and 



^2^^ gives the same difference equation as for and so therefore ^ (^-^2 n + ^2,nj — ^2,n- 

Finally, summing the difference of the equations for ^2 „ and X!^^ gives 



N-l 

^2,N - ^2,N = Yl ('^^1'" '^^2,n+i " ^^2,n Sw^,n+ 
n=0 

Since the 6wj^n are all i.i.d. Normal variables with variance ^At, it is easily shown that 

\2i _ 1 A +2 



2 



and it then follows that 

E[{xl^-Xl^r] = {lAt'f ^^i^i^ + lAt^AT = lT{T+At)At\ 
In the above derivation, when expanding (X2 ^ — ^)^, the first contribution comes from terms 

of the form {5wi^rn Sw2Jn_^_l-5w2,m ^Wi.m+l)'^i^'Wl,n ^'^2,n+^ -5w2,n 5Wi,n+\)'^ m ^ U, while 

the second contribution comes from terms of the form {6wi^n^W2n+i. ~ ^'W2,n^w^^j^i)'^. All 
other terms have zero expectation. □ 

Combining the above result with Lemma 12.21 for p = 2 gives a second order bound on the 
multilevel estimator variance for payoffs satisfying the required smoothness conditions. 



4 General theory 



4.1 Milstein discretisation 



In this section we extend the analysis of the Clark-Cameron example to general the multidimen- 
sional SDE (jl.ip . We make the standard assumptions that /, g and h have a uniform Lipschitz 
bound, and so have uniformly bounded first derivatives. In addition, we make the assumption 
that / and g have uniformly bounded second derivatives. More formally 
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Assumption 4.1. Let f G C^^i^rf^jgd) ^^^^ g g C^(Md^^dxm-^_ j.^^^^ ^^^^^^ ^ constant L such 
that for any x € M"^, and for all 1 < i < d and 1 < j,k,l < m 



^f^, . 

dxi 



< L, 



dgi 



dx 



(x) 



< L, 



dx, 



d'f^ 



dxkdxi 



ix) 



< L, 



dxkdxi 



Let us recall that the general Milstein scheme [12] has the form 

D D 

Xi^n+l = Xi^n + fi{Xn) At + ^ QijiXn) Awj^n + ^ /lyfc(^n) {^Wj^n ^Wk,n " ^jk^t - Ajk,n) ■ 

j=l j,k=l 

(4.1) 

As in the Clark-Cameron example, we drop the Levy areas terms, and instead use the truncated 
Milstein approximation 

D D 

Xi^n+l = Xi^n + fi{Xn) At + ^ QijiXn) Awj^n + ^ hijk{Xn) {Awj^n Awk,n " ^jk^t) . (4.2) 

j=l j,k=l 



Under Assumption 14.11 it is a standard result that the moments of the general Milstein ap- 
proximation Xn are bounded, and A„ strongly converges to the solution of the SDE (jl.ip : this 
remains true for the truncated Milstein approximation as stated in the following Lemma. 

Lemma 4.2. For p >2, there exists a constant Kp, independent of the time step, such that 



E 



max ||A„P 

0<n<Af 



< K.. 



and 



E 



max II A„ - x{tnW 

0<n<N 



< KpAtP/^. 



Proof. The proof in |14] follows the standard method of analysis in references such as |12t[T5^j. □ 

Hence, the rate of strong convergence is 0{At^/^), which is no better than the Euler-Maruyama 
discretisation. Nevertheless, we will show that the antithetic multilevel estimator has a variance 
which converges to zero at the same rate as the full Milstein approximation. 

Corollary 4.3. For p >2, there exists a constant Kp, independent of the time step, such that 



E 



max |/,(A„)r 

0<n<N 



< Kp, E 



max |5ij(A„)|P 

0<n<A' 



<Kp, E 



max |/iijfc(A„)|P 



0<n<Af 



<Kp, 



for all 1 < i < d and 1 < j,k < D. 



Proof. The bounded first derivatives of f{x),g{x),h(x) imply that they grow no faster than 
linearly as ||x|| oo, and the result then follows from the bound in Lemma 14.21 □ 



In order to derive appropriate bounds on the antithetic estimator we also need the following 
lemma. 



10 



Lemma 4.4. For p > 2, there exists a constant Kp, independent of the time step, such that 



max E 

0<n<Af 



< KpAtP/'^. 



Proof. We start from (j4.2|) and inequality (|2.4p which gives 



E [\Xi^n+i - Xi^nH < E U{Xn) At\P] + E 



D 



+ E 



D 



j,k=l 



The first term on the right has a 0{AtP) bomid due to the uniform bound on E For 
the second term we note that because Awj^n is independent of Xn then 

PI 



E 



D 



D 



<DP~'^E[\g,,{Xn)n E[|A^z;,,„n 



and we obtain a O^At^^'^) bound due to the uniform bound on E [|(7jj and standard 

resuhs for the moments of Brownian increments. The third term is handled in a similar way 
and has a 0{At^) bound. 

Together these give a 0(AtP/2) bound for E [ \Xi, n+i — Xj^„,|^] for each i, and hence also for 



IE [ ll^n+i ~ Xn\ 



□ 



4.2 Antithetic MLMC estimator 

Using the coarse timestep At, the coarse path approximation X^, is given by the Milstein 
approximation without the Levy area term, 

D D 

Xln+l = Xln + fi{X^) At + Y, mAK) ^Wj,n + ^iAK) {^Wj^n ^Wk,n " %fe At) . 

j=l j,k=l 

The first fine path approximation xJi uses the corresponding discretisation with timestep At/2, 

D 



X 



f 



X. 



f 

i,n+l 



Xln + fiiXl) At/2 + Y9viXl) 5Wj,n 

i=i 

D 

+ Y ^ijk{Xl) {6Wj^n 5Wk,n " %fe At/2) , 
j,k=l 

i=i 

D 

+ Y ^ijk{Xl^,) (<5u;j>+i ^Wk,n+\ - ^jk At/2 



(4.3) 



(4.4) 



j,k=l 
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in which 

6Wn = 1«(i„+i ) - w{tn), (^^n+i = "^(Wl) " (^•^) 

are the Brownian increments over the first and second halves of the coarse timestep, and so 

AWn = SWn + l^tfn+i • 

The antithetic approximation is defined by exactly the same discretisation except that the 
Brownian increments 5wn and dw^^i are swapped, so that 

D 



D 

+ h,jkiX^) {Sw^,n+l ^^k,n+\ - AV2) 



j,fe=l 



D 



(4.6) 



D 



+ Y hijk{Xl^i) {dWj^n SWk,n - ^jk At/2) . 
j,k=l 

Since 5wn and Sw^^i are independent and identically distributed, X" has exactly the same 

distribution as X^ , and hence E[P(X'*)] = E[P(X-^)]. In addition, the following lemma follows 
directly from Lemma 14.21 and Lemma |4.4[ 



Lemma 4.5. Let X^ and X"' be as defined above. Then for p > 2, there exists a constant Kp, 
independent of the time step, such that 



E 
E 



max 

0<n<N 



xl 



max WX^W'P 

0<n<N 



< Kj,, max E 

0<n<Af 



< Kj,, max E 

0<n<N 



< KpAtP/^, 



< KpAtP/^. 



4.3 Numerical analysis 



The analysis is presented as a sequence of lemmas and theorems, with the proofs deferred to 
the Appendix. The outline is as follows: 



• Lemma 14.61 bounds \\Xn — X^\\ over a coarse timestep; 

• Lemma 14.71 gives a representation of the discrete equation for Xn over a coarse timestep, 
and Corollarv 14.81 gives the corresponding representation for X^; 

• Lemma 14.91 gives a representation of the discrete equation describing the evolution of the 
average X^^ = ^{X-l+X^) over a coarse timestep; 

• Theorem mU] bounds \\xi-X^\\ over a coarse timestep. 
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Lemma 4.6. For all integers p > 2, there exists a constant Kp such that 



E 



max \\Xl-X^\\P 

0<n<N" " 



Lemma 4.7. Difference equation ()4.4p for Xn can be expressed as 

D D 



^{n+l = + hiXfj At + Aw,,n + Yl ''^MXl) ( At.,, „ At.,, „ - % 



j,k=i 



D 



j,k=l 

where E[m/ | = 0, and for any integer p> 2 there exists a constant Kp such that 



max E 

0<n<A' 



< KpAt^P/^, max E \\\nI\\p] < KpAt^P. 



0<n<N 

Corollary 4.8. Difference equation ()4.6p for X^ can be expressed as 

D D 



j=l j,k=l 

D 

j,k=l 
' i.n ' i.n' 



where E[M^ \^n] = 0, and for any integer p> 2 there exists a constant Kp such that 



max E 

0<?i<Af 



\M, 



a lip 



<KpAt^P/^, max E [||Ar^||H <KpAt^P. 



0<n<N 



Lemma 4.9. The difference equation for = ^{X-l+X'^^) can be expressed as 



x' 



D D 

i,n+l = + fi (Xi) At + J2 9ij (Xi) AWj^n + '^Vk (^^ ) i^^ln Awk,n " 

j=l j,k=l 

+ Mi„+Ni„, 



where E[M„ \J^n] = 0, and for any integer p> 2 there exists a constant Kp such that 



max E 

0<n<Af 



\MJ\P 



<K„At^P/^, max E flliV^f 1 < K„ At^P . 



0<n<N 

Theorem 4.10. For all p > 2, there exists a constant Kp such that 



E 



max \\xl - X';,\\P 

0<n<N 



< Kp AtP. 
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4.4 Piecewise linear interpolation analysis 



The piecewise linear interpolant X^{t) for the coarse path is defined within the coarse timestep 
interval [tk,tk+i] as 

Likewise, the piecewise linear interpolants X-^ (t) and X°'{t) are defined on the fine timestep 
[ifc,ifc+i] as 



xf{t)^{i-\)xl + xxl^„ x-{t)^{i-x)x-,+xx-_^,, X 



t — tk 



and there is a corresponding definition for the fine timestep , tfc+i]. 

The proofs of the next two lemmas are in the Appendix, and the Theorem then follows directly. 
Lemma 4.11. For all integers p > 2, there exists a constant Kp such that 



max E 

0<n<Ar 



\x^ 1 iir 



< Kf,AtP/\ 



Lemma 4.12. For all p > 2, there exists a constant Kp such that 



max E 

0<n<Ar 



< Kp AtP, 



where X%t^_^.) = \{X-^ + X<^ j^^) is the midpoint value of the coarse path interpolant. 
Theorem 4.13. For all p > 2, there exists a constant Kp such that 



sup E 

0<t<T 



sup E 

0<t<T 



\\X^{t)-X''{t)\\p] < KpAtP/^ 



x\t)-X''{t) n < KpAtP, 



where X (t) is the average of the piecewise linear interpolants X' (t) and X'^{t). 



5 European and Asian payoffs 



5.1 European options 



In the case of payoff which is a smooth function of the final state x{T), taking p = 2 in Lemma 
2.2\ p = 4 in Lemma 14.61 and p = 2 in Theorem I4.10^ immediately gives the result that the 
multilevel variance 

' (P{X{,) + P{X%))-P{X%) 



V 

has an 0{At'^) upper bound. This matches the convergence rate for the multilevel method for 
scalar SDEs using the standard first order Milstein discretisation, and is much better than the 
0{At) convergence obtained with the Euler-Maruyama discretisation. 
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However, very few financial payoff functions are twice differentiable on the entire domain W^. 
A more typical 2D example is a call option based on the minimum of two assets, 

P(x(T)) = max(0,min(xi(r),X2(T)) - K) , 

which is piecewise linear, with a discontinuity in the gradient along the three lines (s, K), (K, s) 
and (s, s) for s > K. 

To handle such payoffs, we introduce a new assumption which bounds the probability of the 
solution of the SDE having a value at time T close to such lines with discontinuous gradients, 
and then formulate a theorem to show that the multilevel variance which results from using the 
antithetic estimator has an upper bound which is almost 0(At^/^). 

Assumption 5.1. The payoff function P E C(M"',M) has a uniform Lipschitz bound, so that 
there exists a constant L such that 

\P{x) - P{y)\ < L \x - y\ , Vx,yGM^ 

and the first and second derivatives exist, are continuous and have uniform bound L at all points 
X ^ K, where K is a set of zero measure, and there exists a constant c such that the probability 
of the SDE solution x(T) being within a neighbourhood of the set K has the bound 

P ( min||2;(r) - y|| < e ) < c e, Ve > 0. 



In a ID context. Assumption 15.11 corresponds to an assumption of a locally bounded density for 
x{T). 



Theorem 5.2. // the SDE satisfies the conditions of Assumption and the payoff satisfies 
A s sumption \ 5. 1\ then 



E 



0(At3/2- 



for any 5 > 0. 



Proof. We start by noting that 



< 2 E 
+ 2 E 



(i(p(Ai)-p(A^; 



The second term on the right-hand-side has an O(At^) bound due to the uniform Lipschitz 
bound for the payoff, together with the result from Theorem 14. 101 for p = 2. 

The objective now is to prove that the first term has a o 

(At^/s-'J) bound for any 5 > 0. The 
analysis follows the approach used in To prove this for a particular value of 6, we define 
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e = At^/^ ■'Z^, and consider the three events 



A = 
B = 
C = 



min ||x(r) — y\\ < £( , 
yeK J 

||x(T)-Xj(,||>ie}, 



Using 1a to indicate the indicator function for event A, and A^ to denote the complement of 
A, we have 



E 



^,{P{xf,) + P{Xf,))-P{xi) 



= E 
+ E 



\{P{xf,) + P{X%)) - P(xi))' lA.niJcnc^ 



Looking at the first of the two terms on the right-hand-side, then Holder's inequality gives 



E 



\{P{Xl,) + P{X%))-P{X^^)) Wuc 



< E 



(\{P{xl,) + P{X%))-P{X^j,) 



i/p 



(P(^) + P(S)+P(C7))^/'? 



for any p,q>l, with p + 



-1 I ^-1 



1. The Markov inequality gives, 



P)<E x{T)-X 



for any m > 1. Using the strong convergence property from Lemma 14.21 and the definition of 
e, we can take m to be sufficiently large so that 



and hence there exists a constant ci such that ¥{B) < ci e. Using Lemma 14. 6[ one can obtain 
a similar bound P(C) < C2 e, and then q can be chosen sufficiently close to 1 so that 



(P(^) + P(B)+P(C7))i/5 < (1 + ci + C2)^/''At(i/2-5/2)/g = o(Ati/2-5)_ 



Since 



i(p(XjC) + Pix%)) - p{x^^) = Unxf,) - Pixi)) + \{P{X%) - P{X^^)l 

the uniform Lipschitz bound gives 



E 



\{P{Xi,) + P{X%))-P{X 



N. 



1/p 



2p 



1/p 



< csAt, 



for some constant C3 due to Lemma 14.61 and hence 



E 



i(P(Xj^) + P™)-P(xi)) lAuBuc 



o(At3/2-5). 
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Lastly, we consider the second term 



E 



(P(X^) + PTO))-P(X^)) 1a 



Given a path sample oj € {B'^ n C^), if the straight line between and X'^ contains a point 
■(rll and ||x(r) - X{^\ 



y ^ K, then \\y — xj^j\\ and ||x(T) — -'^j^rll are both less than e/2, and hence — y\\ < e. 



Thus, for a path sample u G {A'^ n -B'^ Pi C"^), the straight line between X^^ and Xf^ does not 
contain any points in K. It is therefore possible to perform a second order truncated Taylor 
expansion as in the proof of Lemma 12.21 and deduce that there exists a constant C4 such that 



E 



{P{Xl,) + P{X%))-P{X^^)) 1a 



< C4E 



which has an O(At^) bound due to Lemma W. 



□ 



5.2 Asian payoffs 



For an Asian option, the payoff depends on the average 

rT 



T-^ I x{t)dt. 
Jo 



'0 

This can be approximated by integrating the appropriate piecewise linear interpolant which 
gives 



X'„ 



xf 



ave 



N-1 



^ [ X^{t)dt = N~'Y.UK + ^n+i), 

■^0 n=0 

T N-1 

^ T-^J X-{t)dt = N-^Y.l(^n + ^K+l+^n+l)- 



n=0 



Due to Holder's inequality, 



E 



-y-f "Y^a 



< T'^ / E 
^0 



Xfit)-X''{t) 



dt < supE 

[0,T] 



xfit)-x''it) 



and similarly 



E 



< sup E 

[0,T] 



x\t)-x^{t) 



Hence, if the Asian payoff is a smooth function of the average, then taking p = 2 in Lemma [221 
p = 4 in Corollary 14.111 and p = 2 in Corollary I4.12^ again gives a second order bound for the 
multilevel correction variance. 
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This analysis can be extended to include payoffs which are a smooth function of a number of 
intermediate variables, each of which is a linear functional of the path x{t) of the form 





for some vector function g{t) and measure fj,{dt). This includes weighted averages of x{t) at a 
number of discrete times, as well as continuously-weighted averages over the whole time interval. 

As with the European options, the analysis can also be extended to payoffs which are Lipschitz 
functions of the average, and have first and second derivatives which exist, and are continuous 
and uniformly bounded, except for a set of points K of zero measure. 

Assumption 5.3. The payoff P € C(M'^,M) has a uniform Lipschitz bound, so that there exists 
a constant L such that 

\P{x) - P{y)\ < L \x - y\ , Vx,yGM'^, 

and the first and second derivatives exist, are continuous and have uniform bound L at all points 
X ^ K, where K is a set of zero measure, and there exists a constant c such that the probability 
of Xave being within a neighbourhood of the set K has the bound 



( min ||xa^,e — y\\ < £] < c e, Ve > 0. 
\yeK J 



Theorem 5.4. // the SDE satisfies the conditions of Assumption [TTTj and the payoff satisfies 
A s sumption \ 5. 31 then 

2" 



E 

for any 6 > 0. 



(i(P(XL) + P{X^.e)) - PiKve))' 



o{At 



3/2-5\ 



6 Conclusions 



In this paper we have constructed a new antithetic multilevel Monte Carlo estimator for multi- 
dimensional SDEs, with a variance which is O(At^) when the payoff function is smooth, and 
almost an 0{At^^'^) when it is Lipschitz and piecewise smooth. The algorithm is very easy to 
implement; all that is required is to calculate a second fine path for which the odd and even 
Brownian increments are swapped. 

In the European and Asian payoff cases considered in this paper, it reduces the computational 
complexity for an e root-mean-square error to 0(e~^), compared to 0(e~^(log e)^) for the mul- 
tilevel method using the Euler-Maruyama discretisation, and 0(e~^) for the standard Monte 
Carlo method. Furthermore, by ensuring that the dominant computational effort is on the coars- 
est levels (since /3 > 1), it is now feasible to obtain further improvements using quasi-Monte 
Carlo techniques [^. 

In a second paper, we will extend the analysis to cover digital and barrier options. The im- 
provements from an extended version of the antithetic treatment are then more substantial, 
improving the complexity from 0(e~^/^) to approximately 0{e~^). 
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A Proof of main results 



A.l Proof of Lemma 14.61 



Proof. Conditional on the Brownian increments Aw for the coarse path X^, the Brownian 
increments for X-^ and X" have exactly the same distribution, and therefore X^ — X^ has 
exactly the same distribution as xJi—X^. Hence we obtain, using inequality ()2.4p . 



E 



max \\Xl-X^\\P 

0<n<N" " 



< 2^-1 I E 
= 2P E 

< 22P-1 f E 



max \\Xl-XS^\\P +E 

0<n<iv" " 



max \\Xl-X^\\P 

0<n<N" " 



max \\X^-X^\\P 

0<n<N 



meo, \\Xl - xitnW 

0<n<N 



+ E 



max \\X^ - x(tn)\\P 

0<n<N 



The desired result then follows from the strong convergence property in Lemma 



□ 



A. 2 Proof of Lemma 14.71 and Corollary 14.81 

Proof. Combining the two equations in (j4.3p . and using the identity 

together with the definition of /ijjfc in (jl.2p gives, after considerable re- arrangement. 



where 



D 



= Xl^ + f-{Xl)At + Y.g,,{Xl)Aw,,n 



D 



Rj 



M 



(2) 



M 



(3) 



+ ^ hijk{Xl){Awj^nAwk,n-^jk^t) 
j,k=l 
D 

j,k=l 

+ R,^^ + M^^+M^% 



fr{Xl^.)-f:{Xl)^ At/2, 
D I D \ 

j=i \ ^ k=i J 

D . . 

E ( h^Mxl^i) - h^,k{Xf) ) [dw.^^^Sw,^^^^ - %fcAt/2) . 
j,k=i ^ ' ^ 
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Considering Rn, a Taylor expansion gives 

d 



(^in+i 



for some which hes on the Une between Xn and -i^. Hence, Rn can be spht into two parts, 
Rn = M^^^ + Ar„, where 



j=i k=i ^ 

and 



<n = E E #(^^) 5.-fc(^n0 Sw,,n At/2, 



Ni,n = E ( + E f^M^l) i^Wk,nSwi,n - j At/2 

j=i y k,l=i j 



j,k- 



Considering Mn , a Taylor expansion gives 



+ 5 (^f,„+i -^'„) . 

for some ^2 on the line between Xn and X^ ^ , and therefore 



D d / D 



^,2 = EE|?(^n) (/fc(^^)AV2 + E^«-(^n)('^«^'."'^^-.»-^'-^^^^^ 

\Y. t ^fe) (<„+i ^-.n.^- 

j — 1 kd — 1 



(3) 

Finally, considering M„ we have 



D . X 

^5 = E - ^^i'^(^n^) {Sy^j,n^Sw,^n+l ' ^JkAt/2) 

j,k=l ^ ' 



for some ^3 on the line between Xn and X-^ ^ . 

n+2 
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Setting mI = ui^^ + mP + Mi^\ it is clear that E[m/ \Tn]=0 since 5wn is independent of 



f f f 

Xn, and Sw^.i is independent of X„ and X i . 



All that remains is to bound the mae; nitude of E[ ||m/||p] and E[ \\N^ \\p]. Looking at two of the 

(2) 

terms in M^^, for example, the uniform bound on the first derivatives of g, together with the 
fact that dw„,i is independent of both xJi and Swn leads to 



r/l 



E 



dxk 



hklmiXl) 



E 



\2p 



E 

















and the uniform bound on the second derivatives of g, together with the fact that Sw„,i is 

2 

independent of both xl and X^ , leads to 

n+2 



E 



dxkdxi 



1 . 



< LP E 



X^ 



xl 



2p 



E 



p 



Combining the uniform bound on E 



J, 2p 



from Corollary 14.31 with the bounds from 



Lemma l4.4t and standard results for the moments of Brownian increments, gives the required 
0(At^^'/^) bound for each of the two terms considered. 

Deriving similar bounds for the other terms in and A^-^, and combining them using (|2.4p . 
eventually gives the desired bounds for both E[ ||M/p] and E[ ||a/||^']. 

The proof is almost exactly the same for Corollarv l4.8i The sign change in the second line of the 
equation in the statement of the Corollary is due to the swapping of the Brownian increments 
for the first and second halves of the timestep. 

□ 



A. 3 Proof of Lemma 14.91 

Proof. Recalling that X' =^{XJ taking the average of the results from Lemma 14.71 and 

Corollary 14.81 gives 

D D 
j=l j,k=l 
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where 



(1) 



i,n 



(2) 



i,n 



l{f,{xl)+Mx^))-n{xi)) At, 

D 



>Jj,n, 



j,k=l 
D 



Setting 



j,k=l 



it is clear that E[M„ | J>j] = 0, and all that remains is to bound the magnitude of E[ ||M„||p] 
and E[ HA'np]. By performing second order Taylor series expansions for f{x) and g{x), and first 
order expansions for all about X^, we obtain 



N 



(1) 



1 



- y 

16 ^ 



^ \dxjdxk dxjdxk 

j,k=l ■' ■' 



M 



(1) 



7=1 k,l=l \ ' K I / 



M. 



(2) 



(3) 



A ^ ^ \ dxi 

j,k=i 1=1 ^ 

D d 



for some ^1,^3)^5)^7 between X^ and X„, and ^^2) ^4) Ce, ■^8 between X„ and X^. 

Using the same arguments as in the final part of the proof of Lemma 14. 7t together with the 
bounds on E[ \\mI\\% E[ ||M^||p], E[ \\nI\\p] and E[ \\N^\\% leads to the required bounds for the 
moments of M„ and A'., . □ 



A. 4 Proof of Theorem 14.101 



Proof. If we define S'n = E 



max||xi,-X^||P 



then inequality (|2.4p gives 



Sn < dP-^ ^E 



max 



(A.l) 
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Taking the difference between the equation in Lemma 14.91 and equation (j4.2p , and summing 
over the first m timesteps we obtain 

m—l 



i,m i,m 



= ^ [fiixl) - MXli)) At 

1=0 

m-l D 

1=0 j=l 

m-l D 

+ E E {f'iJkixil) - hijkiXfi)'^ {Awj^iAwk,i - njkAt) 

1=0 j,k=l 

m—l m—l 
Z=0 1=0 



and using inequahty ()2.4p again gives 



E 



max 

m<n 



+ E 



+ E 



+ E 



max 

m<n 



max 

m<n 



max 

m<n 



max 



m—l 



E {hixii) - hixiS) At 

1=0 

m-l D 



1=0 j=l 
m-l D 



1=0 j,fc=l 
m—l 

1=0 



E E (f^iMxii) - hijkiXl^)] {Awj,iAwk,i - ftjkAt) 



+ E 



max 

m<n 



m—l 
1=0 



id 



(A.2) 



We now need to bound each of the five expectations on the right-hand side of (|A.2p . The last 
is the easiest, since 



m—l 

E^. 

1=0 



m—l 



n-1 



1=0 



1=0 



and therefore 



E 



max 

m<n 



m—l 



E^M 



1=0 



n — 1 



< nP-^Y,E[\N^\P] < ci{nAt)P AtP 



1=0 



for some constant ci (which hke other such constants in this proof will depend on p, L and T 
but not on At) due to Lemma [ 



Similarly, there exists a constant C2 such that 



E 



max 

m<n 



m—l 



E (Mxli) - fiiXli)) At 



m—l 



1=0 

n-1 



AtP 



m=0 
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with the second step being due to the uniform bound on the first derivatives of /. 



The other three expectations in ()A.2p involve martingales, and so we can use the discrete 
Burkholder-Davis-Gundy inequality [1]. Starting again with the easiest, there are constants C3, 
C4 such that 





m—1 


p- 






max 

m<n 






<C3E 


( E iM^,mA 




1=0 






\m=0 / 



p/2" 



E 

with the final step being due to Lemma [ 
Similarly, there exists a constant C5 such that 



n-l 



m=0 



E 



max 

m<n 



m-l D 



1=0 j=l 



n-l D 



m=0 j=l 



Since Awj^m is independent of both Xj „ and Xf^^, it follows that 



E 



9ij{x[m) - 9ij{Xlm) ) ^Wj 





p- 






^ Awj^rn 




= E 





Hence, because of the uniformly bounded first derivatives of g, and standard results for the 
moments of Brownian increments, there exists a constant such that 



E 



max 



m-l D 



Y,Y.{9^,{xl)-9^,{XlS)A 



1=0 j=l 



n-l 



< C6(nAtf/2-i 5^ At. 



m=0 



Finally, following the same approach, there exists a constant C7 such that 

p 



E 



max 

m<n 



m-l D 



Yl Yl ('^ijkixli)-hijkiXli)) {Awj^iAwk^i-VLjkAt) 



1=0 j,k=l 



n-l 



< C5 (n At)P/2-i Y Sm At. 



m=0 



Since nAt < T in all of the above inequalities, combining the above bounds for each term in 
()A.2p . and inserting these into (jA.ip . there then exists a constant such that 

Sn<C8 (AtP+Y^rnAt \ . 

V m=0 / 



The desired result is then obtained from a discrete Gronwall inequality. 



□ 
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A. 5 Proof of Lemma 14.111 



Proof. The identity xf^, - X«^, = (xf^, - xl) + {xl - X^) + {X- - X^^ J gives 



n+4 



X^, 1 -X\^ 



n+ 



+ 



+ 



^„ I 1 — 



n+ 



It then fohows from Lemma 14.51 and Lemma 14.61 that there exists a constant Kp, independent 
of both At and n, for which 

PI 



E 



X^ 1 -X". 



□ 



A. 6 Proof of Lemma 14.121 

Averaging the discrete equations for X-^ , and X" i , and using the identities bwn = \^Wn + 
\{6wn - ^w^j^i) and (J'tf^+i = \/\wn - \ {6wn - Sw^j^i), gives 

D 

x[n+\ = ^In + -2 At + i J] g^,{X^n) Au;,,„ + X,,„, (A.3) 

where 

iv^,n = i(ia(x/) + /,(x;j))-/,(xi))At 

D 
D 

+ ^ E [hijkiXl){5wj^nSwk,n - l0.jkAt) + /iijfc(X°)((5w^._„^i5z/;^ - ^Qj-fcAt)^ . 
j,k=i 

Following the same method of analysis as in the proof of Lemma 14.71 it can be proved that 
E[|Xi,„|P] has an 0{AtP) bound. 

Next, defining X^_|_i to be the linear interpolant value |(X^ + X^_^_^), then the equation for 
X^^^ yields 

d d 
j=l j,k=l 

(A.4) 
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Subtracting (|A.4p from ()A.3P gives 



d 

Using the bounds on E[||X^ — the bounded first derivatives of f{x) and g{x), the uniform 
bound on E[ and standard results for Brownian increments, we can conclude that 

there exists a constant Kp, independent of both At and n, such that such that 

< Kp AtP. 



E 



X 



X 



i,n+4 
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